ABSTRACT


Sign language (SL) is the primary gesture-based language of deaf
and speech-impaired people and their main medium of communication.
Sign language recognition methods fall into two broad groups:
image-based and sensor-based. Because sensor-based approaches
require the signer to wear complex devices such as gloves,
armbands, or helmets, much research by companies and academic
groups has focused on image-based approaches. Signers use sign
language to communicate with hearing people, but most hearing
people find it difficult to understand. To address this
difficulty, a real-time translator for sign language using deep
learning (DL) is introduced. It reduces the limitations of other
methods to a large extent. With the help of this real-time
translator, communication becomes easier and faster, without
added delay.

INTRODUCTION

Sign language is a major means of communication for deaf and
speech-impaired people, who cannot hear or speak. Their
difficulty in communicating with others is a major social
issue. In sign language, each gesture has its own meaning.
Communicating with people who do not know sign language
therefore requires a third party who does know it; a third
party who does not know it is of no use. Even with such a third
party present, a communication gap may remain between signers
and hearing people. Signers face these communication
difficulties in many fields. Much research is under way on
image-based methods. The proposed real-time translator using
deep learning can achieve better recognition performance.

RELATED WORKS

Several approaches have been explored in the field of sign
language recognition. Many researchers have used skin-color-based
segmentation for gesture recognition, but it has many problems
that reduce the accuracy of the segmentation. In particular,
variation in illumination is very difficult to accommodate.
Other methods acquire input data with devices such as
accelerometers, helmets, armbands, and sensory gloves. Some use
a camera and colored gloves to acquire the features they need.
All of these methods are inconvenient to wear. In sensory
gloves, flex sensors embedded in the glove measure the bending
of the fingers, and an accelerometer measures the tilt of the
palm. Another system used the Leap Motion Controller (LMC) to
acquire the data. The LMC is a touchless controller developed by
Leap Motion, a technology company based in San Francisco. It
operates at roughly 200 frames per second and can detect and
track hands, fingers, and finger-like objects. Most researchers
acquire their training data by recording their own signers, but
most of these methods produce noisy data. As a result, the
accuracy of gesture recognition is very low.

H. Brashear et al. [8] proposed using multiple sensor types to
disambiguate noise in gesture recognition. In this approach,
accelerometers with three degrees of freedom are mounted on the
wrists and torso to increase the available sensing information.
The accelerometers capture information that the vision system
has difficulty with, such as rotation (when the hand shape
looks similar) and vertical movement towards or away from the
camera. The camera provides information not gathered by the
accelerometers, such as hand shape and position. Both sensors
collect information about the movement of the hands through
space. Sensor selection is based on the amount of information
the sensor collects and its "wearability".

A. G. Jaramillo et al. [5] proposed hand gesture recognition
with electromyography (EMG) using machine learning. The Myo
armband is used as the sensor because of its low cost, small
size, and low weight. Myo is a small, open source sensor that
is easy to wear. The Myo armband has eight dry surface EMG
sensors and an inertial measurement unit (IMU). The eight
surface sensors measure the electrical activity of the muscles
at 200 samples per second. The IMU has 9 degrees of freedom
(accelerometer, gyroscope, and orientation in the X, Y, and Z
axes). The Myo armband uses Bluetooth to transmit the data to
the computer. Finally, the Myo armband incorporates a
proprietary system capable of recognizing five hand gestures:
pinch, fist, open, wave in, and wave out.

A. S. Nikam et al. [4] proposed an image-based hand gesture
recognition technique. It relies on the two most basic
morphological operations, erosion and dilation, which are used
for removing noise, separating individual elements, joining
misaligned elements in an image, and finding intensity bumps or
holes in an image. Erosion shrinks the boundaries of foreground
regions and enlarges holes. Dilation adds pixels at region
boundaries and fills in holes generated during the erosion
process. It can also be used to connect disjoint pixels and add
pixels at edges. The hand gesture is tracked in the captured
image using the convex hull algorithm. Finally, recognition is
performed using features such as the convex hull and convexity
defects obtained from tracking.
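
The effect of erosion and dilation can be illustrated with a
minimal sketch on a binary image. This is not the implementation
used in [4]; it is a dependency-free illustration in plain Python
that assumes a square k x k structuring element and treats pixels
outside the image as background. The function names (erode,
dilate) and the test image are hypothetical.

```python
def _window(img, y, x, r):
    """Yield the pixels in the (2r+1) x (2r+1) window around (y, x).

    Positions outside the image are treated as background (0).
    """
    h, w = len(img), len(img[0])
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            yy, xx = y + dy, x + dx
            yield img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0


def erode(img, k=3):
    """Binary erosion: a pixel stays 1 only if its whole k x k window is 1."""
    r = k // 2
    return [[int(all(_window(img, y, x, r))) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if any pixel in its k x k window is 1."""
    r = k // 2
    return [[int(any(_window(img, y, x, r))) for x in range(len(img[0]))]
            for y in range(len(img))]


# 7x7 image containing a 3x3 foreground block (rows and columns 2 to 4).
clean = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
         for y in range(7)]

# Same image with one isolated noise pixel in the top-right corner.
noisy = [row[:] for row in clean]
noisy[0][6] = 1

# Erosion followed by dilation ("opening") removes the noise pixel
# and restores the block to its original size.
opened = dilate(erode(noisy))
print(opened == clean)
```

Erosion alone would shrink the block to its single centre pixel;
the following dilation grows it back, which is why the two
operations are normally applied as a pair.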

METHODOLOGY 

The proposed system uses deep learning for sign language
recognition and provides a real-time translator for sign
language. Transfer learning is a deep learning technique in
which a model developed for one task is used as the starting
point for a model on a second, related task. The proposed
system focuses on removing the communication barrier between
hearing people and people with hearing or speech impairments.
Images of various hand gestures are collected for training.
After the dataset is prepared, the model is trained on the hand
gesture images. The resulting Inception model, which summarizes
the dataset, is stored as a file. The file holds the model
graph in a format that is not human-readable and is loaded into
memory when required. Using deep learning on the TensorFlow
platform, the input images can then be processed.

The signer is recorded in video format using a web camera. The
signer makes the hand gestures inside a rectangular box that is
visible on the screen, so that the gestures are captured
correctly. The hand gesture to be recognized is loaded into
TensorFlow memory and passed through the Inception model, which
is also loaded into TensorFlow memory. K-Nearest Neighbor (KNN)
is used as the classification algorithm. Using this algorithm,
the output corresponding to the gesture is obtained as text.
The proposed method consists of the following modules:

A. Dataset Collection 

B. Training Images 

C. Image Acquisition 

D. Recognition of gestures 

A. Dataset Collection 

The first module of the proposed system is the dataset
collection module. The proposed model is evaluated on the Sign
Language MNIST dataset from Kaggle to classify the hand gesture
for each letter of the alphabet. The gesture images for each
letter are collected and stored in separate folders, with a
large set of images per letter. In addition, there are gestures
for backspace, space, and some words. The backspace gesture is
used to delete an incorrect entry while translating. The space
gesture is used to separate two words in the displayed text.
Image preprocessing converts the images to the required format,
28x28 grayscale PNG images, using Python's open-source
libraries pandas, NumPy, and others. The dataset is first
loaded using pandas, a package that can read data files in many
formats. The remaining array operations are performed using the
NumPy module.
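
The conversion from a tabular row to a 28x28 image can be
sketched as follows. This is an illustration only: it assumes a
CSV layout with a "label" column followed by 784 pixel columns
named pixel1 to pixel784, and it uses the standard csv module as
a dependency-free stand-in for the pandas and NumPy calls named
above. The function name rows_to_images and the synthetic sample
are hypothetical.

```python
import csv
import io

SIDE = 28  # image side length stated in the text (28x28 grayscale)


def rows_to_images(csv_text):
    """Parse CSV text into (label, image) pairs.

    Each image is a list of SIDE rows, each holding SIDE integer pixels.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = []
    for row in reader:
        label = int(row["label"])
        pixels = [int(row[f"pixel{i}"]) for i in range(1, SIDE * SIDE + 1)]
        # Cut the flat list of 784 values into 28 rows of 28 pixels.
        image = [pixels[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
        samples.append((label, image))
    return samples


# Tiny synthetic file with one sample, standing in for the real dataset.
header = ",".join(["label"] + [f"pixel{i}" for i in range(1, SIDE * SIDE + 1)])
values = ",".join(["3"] + [str(i % 256) for i in range(SIDE * SIDE)])
samples = rows_to_images(header + "\n" + values + "\n")

label, image = samples[0]
print(label, len(image), len(image[0]))
```

With pandas and NumPy the same step is typically a single reshape
of the pixel columns; the explicit slicing here only makes the
row-major layout visible.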



Figure 1. Sign Language


B. Training Images 

Transfer learning is a machine learning method that reuses a
pre-trained neural network. Inception V3 is a convolutional
neural network. The Inception model is retrained on the dataset
described above. The data is split as follows: 80% is used for
training, 10% for validation, and 10% for testing.
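
The 80/10/10 split can be sketched in a few lines. This is a
minimal illustration, assuming a simple shuffled split with a
fixed seed; the paper does not state how the split was drawn, and
the function name split_dataset is hypothetical.

```python
import random


def split_dataset(items, train=0.8, val=0.1, seed=0):
    """Shuffle items and split them into train / validation / test lists.

    Whatever remains after the train and validation parts becomes the test set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# 1000 placeholder sample ids standing in for the gesture images.
train_set, val_set, test_set = split_dataset(range(1000))
print(len(train_set), len(val_set), len(test_set))
```

For a dataset with very uneven class sizes, a per-class
(stratified) split would be a reasonable alternative so that
every gesture appears in all three parts.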

C. Image Acquisition 

The main device used to capture the input images in sign
language recognition (SLR) is a web camera. The proposed method
uses a webcam to capture images, which are then stored in a
directory. The signer must be ready to perform the sign
language hand gesture before the camera is turned on. The
gestures are made inside a bounding box that appears on the
screen, which allows the system to capture only the hand region
of the signer. Because the proposed system is a real-time
translator for sign language, there must be no communication
gap. For this reason, only the hand gesture region inside the
bounding box is processed.

The video of the hand gestures is captured using the web
camera. From the video sequence, images are acquired
automatically as frames. These input gesture images are then
passed into TensorFlow memory.
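
Restricting processing to the bounding box amounts to cropping
each frame before it is passed on. The sketch below is
illustrative only: it represents a frame as a list of pixel rows
rather than a camera image, and the function name crop_box and the
box coordinates are hypothetical.

```python
def crop_box(frame, top, left, height, width):
    """Return the part of the frame inside the bounding box.

    frame is a list of pixel rows; the box is given by its top-left
    corner and its size in pixels.
    """
    if (top < 0 or left < 0
            or top + height > len(frame)
            or left + width > len(frame[0])):
        raise ValueError("bounding box falls outside the frame")
    return [row[left:left + width] for row in frame[top:top + height]]


# Synthetic 6x8 "frame" whose pixel value encodes its position
# as row * 10 + column, so the crop is easy to check by eye.
frame = [[r * 10 + c for c in range(8)] for r in range(6)]
hand = crop_box(frame, top=1, left=2, height=3, width=4)
for row in hand:
    print(row)
```

Cropping first keeps the amount of data per frame small, which
matters for a translator that has to respond in real time.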



Figure 2. Image Acquisition

D. Recognition of gestures

The hand gesture images are loaded as frames into TensorFlow
memory. At the same time, the Inception model, which summarizes
the dataset, is already present in TensorFlow memory. A
classification algorithm is then needed. K-Nearest Neighbor
(KNN) is the classification algorithm used for this purpose. It
is a non-parametric, supervised machine learning method that
can be used for classification and regression.
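
The KNN step can be sketched as a majority vote among the
closest training vectors. This is a minimal illustration, not the
system's actual code: it assumes Euclidean distance and k = 3
(neither is stated in the text), and it uses toy 2-D vectors where
the real system would use feature vectors produced by the
Inception model. The function name knn_predict is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_features, train_labels)
    )
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Toy 2-D feature vectors for two gesture classes.
train_features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
                  (1.0, 1.1), (0.9, 1.0), (1.1, 0.9)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_features, train_labels, (0.1, 0.1)))
print(knn_predict(train_features, train_labels, (1.0, 1.0)))
```

Because KNN has no training phase of its own, new gestures can be
added by appending labelled feature vectors; the cost is that
every prediction compares the query against all stored vectors.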

EXPERIMENTAL ANALYSIS 
A. Result and Analysis 

This section discusses the experimental results of the proposed
system. Visual Studio Code is used as the code editor. The
dataset from Kaggle is used for training. The proposed system
is implemented as four modules: dataset collection, training
images, image acquisition, and recognition of gestures. The
signer makes the hand gestures inside the rectangular frame
that is visible on the screen.



B. Performance Requirements 

Performance requirements describe the hardware specifications
of the system. For a deep learning project, the minimum
requirement is an i3 or better processor. Most deep learning
projects use a 64-bit Ubuntu operating system.


CONCLUSION 

The proposed sign language recognition system for the
communication of deaf and speech-impaired people using deep
learning was implemented successfully with improved accuracy.
We aimed to improve the recognition rate over previous works
and achieved a better success rate. The proposed method yields
a hand gesture recognition model that can recognize dozens of
hand gestures, with recognition accuracy greater than that of
existing real-time models.